1 Introduction.

Among the latest trends in nonparametric statistics are the so-called adaptive
estimations (AE), i.e. estimations that use no a priori information about the
estimated function. Many recent publications construct AE that are optimal in order,
as the number of observations grows, on a continuum of various functional classes
(cf. the References for a list of works on AE, which does not, however, claim to be
exhaustive).

In (Polyak B. et al., 1990), (Polyak B. et al., 1992), (Golubev G. et al., 1992), for
instance, AE were constructed for the problem of estimating regression (R) which
are optimal in order on many subspaces of the space $L_2$, and non-adaptive confidence
intervals for the estimated regression function, also in the norm of $L_2$, were
built on the basis of the obtained estimations; these were later somewhat improved in
(Golubev et al., 1992).

In (Efroimovich S., 1985) AE were constructed for problem (D) of estimating
distribution density, which are optimal on ellipsoids in $L_2$, while in (Golubev, 1994)
AE were constructed for problem (S) of estimating spectral (smooth) density, and
so on.

In numerous publications by D. Donoho et al. (Donoho D. et al., 1993(1), 1993(2),
1996, 1999(1), 1999(2)) and in some others, AE are constructed (and implemented)
which are optimal in order on a number of Besov spaces. In those papers, as well as
in (Golubev G. et al., 1994), (Nussbaum M., 1985), (Tony Cai et al., 1999), (Lee
G., 2003), diverse orthonormalized systems of functions are used to construct AE,
such as wavelets, wedgelets, unconditional bases, splines, Demmler-Reinsch bases,
ridgelets (Candes E.J., 2003), (Dette H., 2003) etc.

For recent results on kernel estimations in the considered problems see, for
example, (AAD W Van Der Vaart et al., 2003), (Allal J. et al., 2003), (Corinne
Berzin et al., 2003).

In our present work, as in the previous ones (Ostrovsky E.I., 1996, 1997(1); 
1997(2), 1999) we construct and analyze AE on the basis of the classical apparatus 
of the well known trigonometric approximation theory (Nikolsky S., 1951), (Timan 
A., 1960), (Bernstein S, 1952, 1954). 

The AE proposed herein feature a speed of convergence which is optimal in order
on any regular subspace compactly embedded in the space $L_2$; the estimations are
universal and very simple in form, which significantly facilitates their
implementation; finally, we construct exponential adaptive confidence intervals
(ACI), i.e. such that the tail of the confidence probability decreases with
exponential speed.

To the best of our knowledge, adaptive confidence intervals first appeared in
our publications (Ostrovsky E. et al., 1996, 1997, 1999). The closest precursor of the
present paper is (Ostrovsky E. et al., 1997); in comparison with it we now
improve the confidence interval and strengthen the type of convergence of random
values: instead of convergence in probability we establish convergence with
probability 1. (We stipulate here that all convergences of sequences of random values
are understood with probability 1 only.)

2 Problem statement. Notation. Conditions.

The following three problems are classical in nonparametric statistics. 

R. The regression problem. Let $f(x)$, $x \in [0,1]$, be an unknown function,
Riemann-integrable together with its square and measured at the points of the net
$x_i = x_{i,n} = i/n$, $i = 1, 2, \ldots, n$, with random independent centered
identically distributed errors $\{\xi_i\}$: $y_i = f(x_i) + \xi_i$. It is required to
estimate the function $f(x)$ with the best possible precision from the values $\{y_i\}$.

D. Estimation of distribution density. On the basis of a set of independent
identically distributed values $\{\xi_i\}$, $\xi_i \in [0,1]$, $i = 1, 2, \ldots, n$,
it is required to estimate their common density $f(x)$ (assumed to exist).



S. Spectral statistics. Let $\{\xi_i\}$ be a Gaussian stationary centered sequence
with spectral density $f(x)$. The object of estimation is $f(x)$. We assume for
convenience that $x \in [0,1]$.

It is supposed that all the estimated functions $f(\cdot) \in L_2[0,1]$; therefore they
are expanded in the norm of this space into a Fourier series in the complete
orthonormalized trigonometric system $\{\varphi_j(\cdot)\}$ on the set $[0,1]$:
$\varphi_1(x) = 1$;

$$l \ge 1 \Rightarrow \varphi_{2l}(x) = \sqrt{2}\cos(2\pi l x); \quad \varphi_{2l+1}(x) = \sqrt{2}\sin(2\pi l x);$$

$$f(x) = \sum_{j=1}^{\infty} c_j \varphi_j(x); \quad c_j = \int_0^1 \varphi_j(x) f(x)\,dx.$$

Let us set $\rho(N) = \rho(f,N) = \sum_{j=N+1}^{\infty} c_j^2$. Evidently
$\lim_{N\to\infty} \rho(N) = 0$. Let us also assume that only the non-trivial
infinite-dimensional case will be considered, when infinitely many Fourier
coefficients of $f$ differ from zero, i.e. $\forall N \ge 1 \Rightarrow \rho(N) > 0$.
Otherwise our estimations will converge with speed $1/\sqrt{n}$.
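The trigonometric system and the tail sums $\rho(N)$ defined above can be illustrated numerically. The following is only a sketch: the function names `phi`, `inner`, `fourier_coefficient` and `rho`, and the midpoint-rule grid size `m`, are assumptions made for this illustration and do not come from the paper.

```python
import math

def phi(j, x):
    """Basis on [0, 1]: phi_1 = 1, phi_{2l} = sqrt(2) cos(2 pi l x),
    phi_{2l+1} = sqrt(2) sin(2 pi l x), with l >= 1."""
    if j == 1:
        return 1.0
    l = j // 2
    if j % 2 == 0:
        return math.sqrt(2.0) * math.cos(2.0 * math.pi * l * x)
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * l * x)

def inner(g, h, m=2000):
    """Midpoint-rule approximation of the L2[0, 1] inner product
    (m is an illustrative grid size, not a quantity from the paper)."""
    return sum(g((i + 0.5) / m) * h((i + 0.5) / m) for i in range(m)) / m

def fourier_coefficient(f, j, m=2000):
    """Approximate c_j = integral of phi_j(x) f(x) over [0, 1]."""
    return inner(f, lambda x: phi(j, x), m)

def rho(coeffs, N):
    """Tail sum rho(N) = sum_{j > N} c_j^2 for a finite list
    coeffs = [c_1, c_2, ...] (truncated, so only an approximation)."""
    return sum(c * c for c in coeffs[N:])
```

For a function with finitely many non-zero coefficients, `rho` is exact and eventually vanishes, which is the trivial case excluded in the text.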

Now let us formulate the exact definition of an asymptotically optimal adaptive
estimation (ADE), or, more precisely, of a sequence of estimations. Let $K(\theta)$,
$\theta \in \Theta$, be some set of Banach subspaces of the space $L_2[0,1]$ (only the
case when the $K(\theta)$ are compactly embedded in $L_2[0,1]$ is non-trivial). Set

$$V(n,\theta) = \inf_{g(n)} \sup_{f \in K(\theta)} E\|g(n) - f\|^2,$$

where $\{g(n)\}$ is any sequence of estimations of $f$. A sequence of adaptive
estimations $\hat f(n)$ is called asymptotically optimal on the set of classes
$K(\theta)$ if $\forall \theta \in \Theta$

$$\sup_n \sup_{f \in K(\theta)} E\|\hat f(n) - f\|^2 / V(n,\theta) < \infty.$$

Of course, the quadratic loss function $l = l(g(n),f) = \|g(n) - f\|^2$ can be
replaced by another loss function, non-negative and monotonically depending on the
norm $\|g(n) - f\|$, such that

$$\forall \varepsilon > 0 \ \exists C(\varepsilon) < \infty \Rightarrow l(z) \le C(\varepsilon)\left(\exp(z^{\varepsilon}) - 1\right), \quad z > 0.$$

Here is an important example of a class $K(\theta)$. Let $\theta = \theta(N)$ be an
arbitrary monotonically non-increasing numerical sequence such that
$\lim \theta(N) = 0$, $N \to \infty$. Denote

$$K(\theta) = \Big\{f : \ \|f\|^2(\theta) \stackrel{def}{=} \sup_{N \ge 1} \rho(f,N)/\theta(N) < \infty \Big\}.$$

Relative to the norm $\|\cdot\|(\theta)$ the class $K(\theta)$ is a Banach space compactly
embedded in $L_2[0,1]$, and the converse is also true: any subspace compactly embedded
in $L_2[0,1]$ is a subspace of some $K(\theta)$.

The value $\rho(f,N)$ is known and well studied in approximation theory. Namely,
$\rho(f,N) = E_{N,2}^2(f)$, where $E_{N,p}(f)$ is the error of the best approximation of
$f$ by trigonometric polynomials of degree not exceeding $N$ in the $L_p$ metric: for
$g : [0,1] \to R^1$ we shall denote

$$\|g\|_p = \left(\int_0^1 |g(x)|^p\,dx\right)^{1/p}, \ p \in [1,\infty); \quad \|g\|_\infty = \sup_{x \in [0,1]} |g(x)|,$$

and it is closely connected with the modulus of continuity of the form

$$\omega_{2,p}(f^{(k)},\delta) = \sup_{h:\,|h| \le \delta} \|f^{(k)}(x+h) - 2f^{(k)}(x) + f^{(k)}(x-h)\|_p$$

(Timan A., 1960, p. 275); arithmetical operations on the arguments of the function $f$
and of its derivatives are understood modulo 1 (periodicity).
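Everywhere below condition $(\gamma 1)$ will be considered fulfilled:

$$(\gamma 1): \ \overline{\lim}_{N \to \infty}\, \rho(2N)/\rho(N) \stackrel{def}{=} \gamma < 1, \qquad (1)$$

and sometimes the stronger conditions as well:

$$(\gamma): \ \exists \lim_{N \to \infty} \rho(2N)/\rho(N) \stackrel{def}{=} \gamma < 1; \qquad (2)$$

$$(\gamma 0): \ \gamma = 0. \qquad (3)$$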

It is easy to show that condition (1) implies

$$\rho(N) \le C N^{-2\beta}, \quad 2\beta \stackrel{def}{=} \log_2(1/\gamma) > 0. \qquad (4)$$
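One possible route from (1) to a bound of the form (4) is sketched below; it is an illustration rather than the authors' argument. The auxiliary symbols $\gamma'$, $N_0$ and $\beta'$ are introduced here only for the sketch, and strictly speaking the sketch yields the exponent $2\beta' = \log_2(1/\gamma')$ for every $\gamma' \in (\gamma, 1)$; it gives the exponent in (4) itself when $\rho(2N) \le \gamma\rho(N)$ holds for all large $N$.

```latex
% Fix \gamma' \in (\gamma, 1). By (1) there exists N_0 such that
%   \rho(2N) \le \gamma' \rho(N)   for all N \ge N_0 .
% Iterating k times:
\rho(2^k N_0) \le (\gamma')^k \rho(N_0), \qquad k = 0, 1, 2, \ldots
% For N \ge N_0 choose k with 2^k N_0 \le N < 2^{k+1} N_0.
% Since \rho is non-increasing and k > \log_2(N/N_0) - 1:
\rho(N) \le \rho(2^k N_0) \le (\gamma')^{k} \rho(N_0)
        < (\gamma')^{-1} \rho(N_0) \left( N/N_0 \right)^{\log_2 \gamma'}
        = C N^{-2\beta'}, \qquad 2\beta' = \log_2(1/\gamma').
```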

In problem (R) it will be assumed that $\beta > 1/2$. There are some grounds to
suppose that for $\beta \le 1/2$ asymptotically optimal AE do not exist in the
regression problem; for a similarly stated problem this was proved by Lepsky
(Lepsky O., 1990).

Also denote $\kappa = \max(1, 2\beta)$. Here and below the symbols $C, C_r$ will denote
positive finite constructive constants inessential in this context; $\asymp$ is the
usual symbol, in detail:

$$A(n) \asymp B(n) \ \Leftrightarrow \ C_1 \le \liminf_{n\to\infty} A(n)/B(n) \le \limsup_{n\to\infty} A(n)/B(n) \le C_2, \quad \exists C_1, C_2 \in (0,\infty);$$

the symbol $A \sim B$ means that in the given concrete passage to the limit
$\lim A/B = 1$.

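Example 1. Denote by $W(C,\alpha,\beta)$ a class of functions $\{f\}$ such that

$$\rho(f,N) \sim C N^{-2\beta} (\log N)^{\alpha}, \quad \exists C, \beta > 0; \ \alpha = const.$$

Also denote $W(\alpha,\beta) = \cup_{C>0} W(C,\alpha,\beta)$;

$$W(\beta) = W(0,\beta); \quad W = \cup_{\beta>0} W(\beta).$$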



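For the class of functions $W$ condition $(\gamma)$ is fulfilled. It is known from
[19, pp. 275, 353] that $f \in W(\alpha,\beta)$ if and only if as $\delta \to 0+$,
$\delta \in (0, 1/2]$,

$$\omega_{2,2}(f^{([\beta])},\delta) \sim \delta^{\{\beta\}} |\log\delta|^{\alpha/2},$$

$$\forall j \le [\beta] \Rightarrow f^{(j)}(1-0) = f^{(j)}(0+0),$$

where $[\beta]$ denotes the integer part of $\beta$ and $\{\beta\} = \beta - [\beta]$. At
$\{\beta\} = 0$ the function $f^{([\beta])}(x)$ is assumed to be continuous.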
Example 2. Let us denote

$$Z(\alpha,\beta) = \{f : \rho(f,N) \sim \alpha\beta^N\}, \quad \alpha > 0, \ \beta \in (0,1);$$

and $Z = \cup_{\alpha>0,\,\beta\in(0,1)} Z(\alpha,\beta)$. For functions of class $Z$
condition $(\gamma 0)$ is fulfilled. Besides, functions of class $Z$ are analytic
[20, p. 129].

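Denote for the problems R, D, S respectively, at $j < n$,

$$\hat c_j = \frac{1}{n}\sum_{i=1}^{n} y_i \varphi_j(x_i); \quad \hat c_j = \frac{1}{n}\sum_{i=1}^{n} \varphi_j(\xi_i); \quad \hat c_j = \sum_{i=1}^{n-j} \xi_i \xi_{i+j}/(n-j), \qquad (5)$$

$j = 0, 2, 4, \ldots$, and $\hat c_j = 0$ otherwise; and for the regression problem

$$c_j(n) = n^{-1}\sum_{i=1}^{n} f(x_i)\varphi_j(x_i), \quad B(n,N) = \sum_{k=N+1}^{2N} c_k(n)^2 + A_1 N/n;$$

$A_1 = \sigma^2 = D\xi_i$; for problem D

$$B(n,N) = \sum_{k=N+1}^{2N} c_k^2 + A_2 N/n, \quad A_2 = 1;$$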



for the spectral problem 



2iV 

B(n,N)= Y 4 + A 3 N/n, A 3 = 

fc=JV+l 

and again for all the problems set

$$B(n) = \min_{N=1,2,\ldots,[n/3]} B(n,N), \quad N^0 = N^0(n) = \operatorname*{argmin}_{N=1,2,\ldots,[n/3]} B(n,N);$$

$$A(n,N) = \rho(N) + A_s N/n, \quad A(n) = \min_{N=1,2,\ldots,[n/3]} A(n,N),$$

where $s$ is the problem number.
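The empirical Fourier coefficients of problems R and D in (5) can be computed directly. The sketch below covers only these two cases; the function names `c_hat_regression` and `c_hat_density` are hypothetical, and the basis `phi` is re-defined so that the block is self-contained.

```python
import math

def phi(j, x):
    """phi_1 = 1, phi_{2l} = sqrt(2) cos(2 pi l x), phi_{2l+1} = sqrt(2) sin(2 pi l x)."""
    if j == 1:
        return 1.0
    l = j // 2
    trig = math.cos if j % 2 == 0 else math.sin
    return math.sqrt(2.0) * trig(2.0 * math.pi * l * x)

def c_hat_regression(y, j):
    """Problem R: (1/n) * sum_i y_i * phi_j(x_i) with the net x_i = i/n.
    y[i-1] holds the observation y_i."""
    n = len(y)
    return sum(y[i - 1] * phi(j, i / n) for i in range(1, n + 1)) / n

def c_hat_density(sample, j):
    """Problem D: (1/n) * sum_i phi_j(xi_i) for a sample xi_1..xi_n in [0, 1]."""
    return sum(phi(j, xi) for xi in sample) / len(sample)
```

With noise-free observations $y_i = \varphi_3(x_i)$ the regression coefficient $\hat c_3$ equals 1 up to rounding, because the trigonometric system is also orthonormal on the net for low frequencies.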

For instance, suppose that $f \in W(C,\alpha,\beta)$; then
$A(n) \asymp n^{-2\beta/(2\beta+1)} (\log n)^{\alpha/(2\beta+1)}$, and in the case
$f \in Z(\alpha,\beta) \Rightarrow A(n) \asymp \log n / n$.
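The rate for the class $W$ can be checked numerically in the simplest case $\alpha = 0$. The sketch below assumes the model tail $\rho(N) = N^{-2\beta}$ and $A_s = 1$; both choices, and the function names, are illustrative assumptions.

```python
def A(n, beta, A_s=1.0):
    """A(n) = min over N = 1..[n/3] of rho(N) + A_s * N / n
    for the model tail rho(N) = N ** (-2 * beta)."""
    return min(N ** (-2.0 * beta) + A_s * N / n for N in range(1, n // 3 + 1))

def rate(n, beta):
    """The claimed order n ** (-2 beta / (2 beta + 1)) (case alpha = 0)."""
    return n ** (-2.0 * beta / (2.0 * beta + 1.0))

# The ratio A(n) / rate(n) should stay between two positive constants.
for n in (10**3, 10**4, 10**5):
    print(n, A(n, beta=1.0) / rate(n, beta=1.0))
```

For $\beta = 1$ the continuous minimisation gives the constant $2^{-2/3} + 2^{1/3} \approx 1.89$, and the printed ratios stay close to it, in agreement with the relation $\asymp$.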



Our notation should not be surprising, as it follows from the Bernstein theorem
[20, p. 242] and from condition $(\gamma 1)$ that all the introduced functionals
$\{B(n,N)\}$, $\{B(n)\}$ arising from the different problems are mutually
$\asymp$-equivalent. Besides, for the same reasons

$$A(n,N) \asymp B(n,N); \quad A(n) \asymp B(n).$$

Let us make another additional assumption with regard to the class of estimated
functions $\{f\}$:

$$(v): \ \forall v > 1, \ \forall N \in [1, N^0/v] \cup [N^0 \cdot v, [n/3]] \Rightarrow$$

$$B(n,N) - B(n) \ge C_1 (v-1)^2 \left(1 + C_2|v-1|\right)^{-1} B(n). \qquad (6)$$

(At $v > N^0$ the left interval in (6) is absent; at $v > n/(3N^0)$ the right interval
in (6) is absent.)

The classes of functions satisfying conditions $(\gamma 1)$ and $(v)$ will be called
regular. Classes $W$ and $Z$ are regular.

Apart from that, it is clear that in the regression problem conditions must be
imposed not only on the estimated function, but on the measurement errors $\xi_i$ too.
Two kinds of such conditions will be considered:

$$(Rk): \ \exists k = 2, 3, \ldots, \quad \mu_{2k} \stackrel{def}{=} E\xi_i^{2k} < \infty$$

(the power level) and the exponential level:

$$(Rq): \ \exists q, Q \in (0,\infty) \Rightarrow P(|\xi_i| > x) \le \exp\left(-(x/Q)^q\right), \quad x > 0.$$

The classical projective estimates of N. N. Tchentsov (Tchentsov N.N., 1972, p.
286) will be considered as estimates of the function $f$:

$$f(n,N,x) = \sum_{j=1}^{N} \hat c_j \varphi_j(x).$$

Since, as shown by Tchentsov, $E\|f(n,N,\cdot) - f(\cdot)\|^2 \asymp B(n,N)$, the
selection of the number of harmonics $N$ optimal in order in the sense of
$L_2(\Omega) \times L_2[0,1]$ is given by the expression $N = N^0(n)$, with the speed of
convergence $f(n,N^0,\cdot) \to f(\cdot)$ in the above-mentioned sense equal to
$\sqrt{A(n)}$. I. A. Ibragimov and R. Z. Khasminsky (Ibragimov I., Khasminsky R., 1982)
proved that no convergence faster than that given by the value $\sqrt{A(n)}$ exists on
regular classes of functions.

However, the value $\rho(f,N)$, or at least its order as $N \to \infty$, is as a rule
unknown in practice. Below the adaptive estimation of $f$ will be studied, based only
on the observations $\{\xi_i\}$ and using no a priori information regarding $f$, and yet
possessing the optimal speed of convergence under apparently weak restrictions. Set

$$\tau(N) = \tau(n,N) \stackrel{def}{=} \sum_{k=N+1}^{2N} \hat c_k^2, \quad \hat N(n) \stackrel{def}{=} \operatorname*{argmin}_{N \in (1,[n/3])} \tau(n,N), \qquad (7)$$

$$\tau^*(n) = \min_{N \in (1,[n/3])} \tau(n,N).$$

Our adaptive estimations $\hat f$ in all the considered problems have a universal form:

$$\hat f = f(n,\hat N(n),x) = \sum_{j=1}^{\hat N(n)} \hat c_j \varphi_j(x). \qquad (8)$$

If the minimizing number of harmonics $\hat N(n)$ in (7) is not unique, we choose the
largest. Below the value $N$ will always belong to the set of integers
$1, 2, \ldots, [n/3]$.
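The selection rule (7) and the estimate (8) can be sketched as follows, given a list of empirical coefficients. The function names and the 0-based list layout (`c_hat[k - 1]` holds $\hat c_k$) are assumptions of this illustration, not notation from the paper.

```python
def tau(c_hat, N):
    """tau(n, N) = sum_{k=N+1}^{2N} c_hat_k^2; c_hat[k - 1] holds c_hat_k."""
    return sum(c_hat[k - 1] ** 2 for k in range(N + 1, 2 * N + 1))

def choose_harmonics(c_hat, n):
    """Return (N_hat, tau_star): the minimiser of tau over N = 1..[n/3],
    taking the largest N when the minimiser is not unique."""
    n_max = n // 3
    if n_max < 1 or len(c_hat) < 2 * n_max:
        raise ValueError("need n >= 3 and coefficients c_hat_1..c_hat_{2[n/3]}")
    best_N, best_tau = 1, tau(c_hat, 1)
    for N in range(2, n_max + 1):
        t = tau(c_hat, N)
        if t <= best_tau:  # '<=' keeps the largest minimiser on ties
            best_N, best_tau = N, t
    return best_N, best_tau

def adaptive_estimate(c_hat, n, basis):
    """Projection estimate (8): x -> sum_{j <= N_hat} c_hat_j * basis(j, x),
    where basis(j, x) is any implementation of phi_j(x)."""
    N_hat, _ = choose_harmonics(c_hat, n)
    return lambda x: sum(c_hat[j - 1] * basis(j, x) for j in range(1, N_hat + 1))
```

Note that the rule needs the coefficients up to index $2[n/3] < n$ only, and that the resulting number of harmonics is a random quantity depending on the data through `c_hat`.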

Before proceeding to formulations and proofs, let us informally clarify our idea
for choosing $\hat N(n)$. It is easy to find by direct calculation that

$$E\tau(n,N) \asymp B(n,N), \quad D\tau(n,N) \asymp B(n,N)/n, \qquad (9)$$

and therefore

$$N \to \infty, \ N/n \to 0 \Rightarrow \sqrt{D\tau(n,N)}/E\tau(n,N) \to 0.$$

(In the case of the regression problem the condition $\beta > 1/2$ is essential, which
is common in statistical research (Polyak B. et al., 1990, 1992), (Lepsky O., 1990).)
It follows from (9) that there are some grounds to assume

$$\tau(n,N) \asymp E\tau(n,N) \asymp A(n,N)$$

and therefore

$$\hat N(n) = \operatorname*{argmin}_{N \le n/3} \tau(n,N) \sim \operatorname*{argmin}_{N \le n/3} E\tau(n,N) = N^0(n).$$

Also note that the number of harmonics $\hat N(n)$ proposed by us is a random value (!)
and that estimation (8) is non-linear in the totality of empirical Fourier
coefficients $\{\hat c_j\}$.

In the case of problem S our estimation $\hat f$ is homogeneous of degree 2 as a
function of the initial data $\{\xi_i\}$, but also non-linear.



3 Formulation of the main results. 

Let us denote

$$P_f(u) = P\left(B^{-1}(n)\,\|\hat f - f\|^2 > u\right), \quad u > C, \ C \in (0,\infty).$$

Theorem R.1(k). If the conditions $(\gamma 1)$ and $\mu_4 < \infty$ are fulfilled in the
regression problem, then

$$P_f(u) \le C_1\,\mu_4\,u^{-1}\log^2(C_2 u), \quad u > e/C_2,$$

$$C_1 = \min_{X \in (0,1/2)} \left(X(0.5-X)^{-2} + X^{-1}\right) \approx 7.221039\ldots,$$

$$C_2 = \operatorname*{argmin}_{X \in (0,1/2)} \left(X(0.5-X)^{-2} + X^{-1}\right) \approx 0.198340\ldots.$$

This result was proved in (Bobrov P. et al., 1997), but here the values of the
constants have been corrected.
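The constants in Theorem R.1(k) can be checked numerically. The sketch below assumes that the minimised function reads $g(X) = X(0.5 - X)^{-2} + X^{-1}$, which is a reading of a damaged formula; the golden-section routine `golden_min` is a generic helper written for this illustration. Under this reading the minimum comes out near 7.2214 at $X$ near 0.1983, which agrees with the quoted constants to about three decimal places.

```python
def g(X):
    """Assumed objective: X * (0.5 - X)**(-2) + 1 / X on (0, 1/2)."""
    return X / (0.5 - X) ** 2 + 1.0 / X

def golden_min(fun, a, b, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [a, b].
    Returns (argmin, min value)."""
    inv_phi = (5 ** 0.5 - 1) / 2
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if fun(c) < fun(d):
            # minimum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # minimum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    x = (a + b) / 2
    return x, fun(x)

x_min, g_min = golden_min(g, 1e-6, 0.5 - 1e-6)
print(x_min, g_min)
```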

Theorem R.2(k). If the conditions $(\gamma 1)$ and (Rk) for some $k = 3, 4, \ldots$ are
fulfilled in the regression problem, then

$$P_f(u) \le 2^{2k} k^k \mu_{2k} u^{-k/2}, \quad u > 0.$$

Theorem R.3(q). Under the conditions (Rq), $(\gamma 1)$, $(v)$ in the same problem, at
$u > C = 2(1-\gamma)^{-1} Q$ the following inequality is true:

$$P_f(u) \le 5\exp\left(-C_1\,\frac{\left(N^0(n)\,(u - C)/Q\right)^{q/(2q+4)}}{|\log B(n)|}\right).$$

Theorem D. If in addition to the formulated conditions the boundedness of $f$
is presumed, then in problem (D) at $u > C = (1-\gamma)^{-1}$

$$P_f(u) \le 5\exp\left(-C_2\,\frac{(u - C)\,N^0(n)}{|\log B(n)|}\right).$$

Theorem S. If the spectral density $f(x)$ is bounded and conditions $(\gamma 1)$, $(v)$
are fulfilled, then at $u > C = (1-\gamma)^{-1}$

$$P_f(u) \le 5\exp\left(-C_3\,\frac{(u - C)\,N^0(n)}{|\log B(n)|}\right).$$



Theorem (R.k) a.s. If in problem R condition (Rk) is fulfilled and the series

$$\sum_{n=1}^{\infty} n^{-k/2} A^{-k/2+2\kappa}(n) < \infty$$

converges, then

$$\lim_{n\to\infty} \tau^*(n)/B(n) = 1, \qquad (10)$$

and if condition $(v)$ is also fulfilled, then

$$\lim_{n\to\infty} N^0(n)/\hat N(n) = 1. \qquad (11)$$

(Recall that the convergence of a r.v. is understood in this paper only with
probability 1.)

Theorem (Rq) a.s. If in the same problem under condition (Rq) for any
$\varepsilon > 0$ the series

$$\sum_{n} \exp\left(-\varepsilon\,\frac{(nA(n))^{q/(2q+4)}}{|\log A(n)|}\right) < \infty \qquad (12)$$

converges, then propositions (10) and (11) hold as well.

Theorem (D)(S) a.s. Let for problems (D), (S), besides the above-formulated
assumptions, condition (12) also be fulfilled with $q/(2q+4)$ replaced by 1. Then the
convergences (10) and (11) hold here as well.
(In comparison with (Bobrov P. et al., 1997) the exponent indices are significantly
decreased.)



4 Auxiliary results. 

The technical apparatus for the proofs is the theory of so-called $G(\psi)$-spaces, i.e.
Banach spaces of random values with rapidly diminishing tails of the distributions
[16, 23]. For the reader's convenience the necessary information from that theory
is provided here without proof.

A random value $\eta$ defined, like all the other values in the present paper, on
a fixed probability space belongs to the space $G(\psi)$, where $\psi = \psi(m)$ is a
function monotonically increasing on the set $m \in (1,\infty)$ and finite for at least
one value $m > 1$, if the norm

$$\|\eta\|(G(\psi)) \stackrel{def}{=} \sup_{k \ge 1} |\eta|_k/\psi(k), \quad |\eta|_k \stackrel{def}{=} E^{1/k}|\eta|^k,$$

is finite. If $\psi(m) = m^{1/q}$, $q = const > 0$, then the corresponding space will be
denoted $G_p$, $p = q/(q-1)$; $q = 1 \Rightarrow p = +\infty$, and the norm in it
$|||\eta|||_p$; moreover, $\eta \in G_p$ if and only if

$$\exists C \in (0,\infty), \ \forall x > 0 \Rightarrow P(|\eta| > x) \le \exp(-Cx^q). \qquad (13)$$

Now let $\eta(t)$, $t \in T$, be a separable random field, $T$ an arbitrary set, and
$\sup_{t\in T} |||\eta(t)|||_p \le 1$. Introduce the so-called natural metric (more
exactly, semi-metric) $d_p(t,s) = |||\eta(t) - \eta(s)|||_p$ and denote by
$N(d_p,\varepsilon)$ the least number of $d_p$-spheres with radius $\varepsilon > 0$
covering the entire set $T$. If the so-called entropic integral

$$J \stackrel{def}{=} \int_0^1 \left(\log N(d_p,\varepsilon)\right)^{1/q} d\varepsilon < \infty$$

converges, then

$$|||\sup_{t\in T} |\eta(t)|\,|||_p \le C_1 + C_2 J. \qquad (14)$$

teT 

A similar result for the spaces $L_k(\Omega)$ was obtained by G. Pizier (Pizier G.,
1979 - 1980). It is asserted that if

$$1) \ \exists k \ge 1 \Rightarrow \sup_{t\in T} E|\eta(t)|^k \le 1;$$

$$2) \ I \stackrel{def}{=} \int_0^1 N^{1/k}(r_k,\varepsilon)\,d\varepsilon < \infty,$$

where $r_k(t,s) = |\eta(t) - \eta(s)|_k$, then

$$|\sup_{t\in T} |\eta(t)|\,|_k \le C_1 + C_2 I. \qquad (15)$$



5 Proofs 

The proofs of the theorems referring to the different problems are similar. The
assertions referring to problem R, which is the most complicated, will be proved
below in detail, and after that the changes arising in problems D and S will be
indicated. Some additional notation has to be introduced: for $f : [0,1] \to R^1$ and
$p \ge 2$ we shall denote

$$\|f\|_{p,n} = \left(n^{-1}\sum_{i=1}^{n} |f(x_i)|^p\right)^{1/p},$$

while in the case of $p = 2$ the index $p$ of the norm sign will be omitted. Further,

$$\Phi(N,f,x) = \Phi(N,x) = \Phi(N) = \sum_{j=1}^{N} c_j\varphi_j(x)$$

are the partial Fourier sums of the function $f(x)$,

$$T(N) = T(N,x) = \Phi(2N,f,x) - \Phi(N,x), \quad N \le n/3.$$
Lemma 1. For all $p \ge 2$

$$\|T(N)\|_p \asymp \|T(N)\|_{p,n} \le C N^{1/2-1/p}\sqrt{\rho(f,N)}.$$

Proof. The first assertion follows from the fact that $N \le n/3$ and from the
Bernstein inequality [19, p. 245]. The other uses the Nikolsky inequality (Timan A.,
1960, p. 245):

$$n^{-1}\sum_{i=1}^{n} |T(N,x_i)|^p = \|\Phi(2N) - \Phi(N)\|_{p,n}^p \le 2^p\|\Phi(2N) - \Phi(N)\|_p^p \le$$

$$C^p N^{p/2-1}\|\Phi(2N) - \Phi(N)\|^p = C^p N^{p/2-1}\left(\rho(N) - \rho(2N)\right)^{p/2} \le C^p N^{p/2-1}\rho^{p/2}(N).$$

Lemma 2. Let us consider on the set $S = \{1, 2, \ldots, n\}$ the metric

$$d(N_1,N_2) = |\rho(N_1) - \rho(N_2)| + n^{-1}|N_1 - N_2|.$$

It is asserted that the entropy of the set $S$ in the metric $d$, i.e.
$H(S,d,\varepsilon) = \log N(S,d,\varepsilon)$, $\varepsilon \in (0,1]$, satisfies the
inequality

$$H(S,d,\varepsilon) \le C + \kappa|\log\varepsilon|, \quad \kappa = \max(1, 2\beta).$$

Proof. Set $K = C\varepsilon^{-1/\beta}$ and consider the $\varepsilon$-net
$S(\varepsilon)$ of $S$ in the metric $d$ of the form

$$S(\varepsilon) = \Big[\{1, 2, \ldots, K\} \cup \bigcup_j \{[nj\varepsilon/2]\}\Big] \cap S.$$

Counting the number of elements in $S(\varepsilon)$ proves the lemma.

The central point in all the further considerations is the so-called expansion of
the basic functional $\tau(n,N)$. In all the three problems under consideration
$\tau(n,N)$ is of the form

$$\tau(n,N) = E\tau(n,N) + 2\Psi_1(N) + \Psi_2(N), \quad E\tau(n,N) \sim B(n,N),$$

$$E\Psi_1(N) = E\Psi_2(N) = 0; \quad \Psi_s(N) = \Psi_s(n,N),$$

where in the case of the regression problem

$$\Psi_1(N) = n^{-1}\sum_{i=1}^{n} \xi_i T(N,x_i), \quad \Psi_2(N) = n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{i,j}(n,N)\left(\xi_i\xi_j - E\xi_i\xi_j\right),$$

$$a_{i,j}(n,N) = D_{2N}(x_i,x_j) - D_N(x_i,x_j),$$

and $D_N(x,y) = \sum_{j=1}^{N} \varphi_j(x)\varphi_j(y)$ is the Dirichlet kernel.

It is easy to obtain by direct calculation for problem R (and then for the remaining
problems) that

$$D\Psi_1(N) \asymp \rho(N)/n, \quad D\Psi_2(N) \asymp N/n^2.$$
Lemma 3. In the regression problem under conditions (Rk) the following inequality 
holds: 

|*i(A0U < C k pH 2k y/p(N)/n, k = 2, 3, . . . . (16) 

Proof. We shall apply the moments inequalities for the sums of centered indepen- 
dent variables {£«}, i — 1, 2, . . . ,n at (Rosental H., 1970), (Johnson W.B., Shecht- 
man G, Zinn J. at al., 1985): p > 2 =>- 

| Y. £ i\v ^ 3 (p/ lo SP) maX (l H £ i\li E \ £ i\ P P ) 1/P 




Here Si = & T(N,Xi), J2i — J27=ii V — 2k. As long as, on the basis lemma 1, 
£ \TnN, Xi ) = n\\T(N)\\l d < Cn\\T(N)\P P < 

i 

CnN p/2 - l \\T(N)\\ p 2 = CnN p/2 - l p p/2 (N), 

we obtain the conclusion of lemma 3. 

Lemma 4. In the same problem and in the same assumptions 

\MN)\k<Cknli k y/N/n. (17) 

Proof. It is sufficient to prove (17) for even k, while for the odd ones it is necessary 
to consider the moment of order k + 1 and make use of the Lyapunov inequality. The 
functional ^(-^O is the quadratic centered form from the random values {£«}, % — 
1,2, ... ,n. In order to estimate its k-th moment we will estimate its cumulant of 
the same order. According to [29, p. 101], 

r fc (* 2 (AQ) kk ( w n \ k - 2 

where r^(^) denotes the k-th semi-invariant of the value £, W n = 



max y^ la,- An, N) I < max Y^ 



27V 
l=N+l 



Analogously to the estimations of the Lebesgue constants in the theory of trigono- 
metrical series we can estimate W n < C log N/n, and consequently at k > 4 

r fe (* 2 (A0) < C k k k ii 2k N- 1 logN • D fc / 2 (^ 2 (iV)). 

Proceeding by the well-known Leonov - Shiryaev formulas (Shiryaev A.N., 1989, 
p. 311) from semi-invariants to moments, we obtain the proposition of lemma 4. 
Lemma 5. Under the conditions of Lemmas 3 and 4 the following inequalities hold 
respectively: 

\r(n,N) - Er(n,N)\ k <Ck ^ ^JA{n,N)/n, 

and if condition (Rq) is fulfilled, then on the basis of the properties of spaces G(ip) 

\\\r(n,N) - Er(n, N)\\\ r < C^A{n,N)/n, r = q/(q + 2). 

The assertion of the lemma 5 it follows from the inequality of the triangle for the 
used norms. 

Let us consider the centered and normalized random field 

((N) = C(n, N) = y/n/(A(n,N) [r(n, N) - Er(n, N)\, 




so that EC(iV) = 0, su Pjv < [n/3] | |CW | | r < C. 
Lemma 6. 



(Rk) => | max \((N) \ k < CA- 2K ' k (n), k > 3k; 



N 



(Rq) => |||max|C(A0| ||| r < C\ log A(n) 



il/r 



The proof will be given for the second assertion alone, as the first is simpler because 
the spaces Lk(Q) are more customary. We obtain on the basis of lemma 5, put 

V(N) = 2tf i(JV) + V 2 (N) = r(n, N) - Er(n, N) : 



»-* (cm - cm) = ^^1^ + 

[^(N,) - *(N 2 )]/y/A(n,N 2 ) = f Ci + C2; 
IHCalllr < CyJ\A(n, N,) - A(n, N 2 )\/(A(n)yff) < 



C^Wh) - p(N 2 )\ + n- 1 !^ - N 2 \/{A{n)y/n), 
since A(n, N) > A(n), a > b > ^ y/a - Vb < y/a-b, 



|||*(7Vi) - # (JV 2 )|||r < Cy/lA^iVO-A^iV^I/n. 
Further, |||Ci|||r < |||#(JVi)||| r x 



l^/A^TVO-^n,^)!! • UAfaNjAfaN, 



-1/2 



< 



CyfWh) - p(N 2 )\ + n-i| JVi - iV 2 |/(A(n)v^), 

since 1 1 1 1 ^ r (-ZVx ) 1 1 j t- < Cj^Jn. The random field C(iV) is thus bounded in the norm 

II • ||L and 



di(7V!,iV 2 ) = J |||C(iVi) - CWHIr < CyJdiN^Nj/Ain). 

Since H{S,d u e) < H(S,Vd/(CA(n)),e) = 

H(S,d,(CeA(n)) 2 ) < d + 2k\ log e\ + 2k\ log A(n)\. 

the assertion of the lemma follows from the properties of the spaces G(ip) (13, 14, 
15). 
Inequalities (16) and (17) can be reformulated as follows in forms more convenient
for further application. Under conditions (16) and (17) respectively the sequence
$\{\tau(n,N)\}$ can be expanded into

r(n, N) = Er(n, N) + ^Er(n,N)/n k ^ 2 { k (A(n)y 2K/k v(n, N), 

where $E\nu(n,N) = 0$,

$$\sup_n E\max_N |\nu(n,N)|^k = C < \infty; \qquad (18)$$

and in the other case (Rq)

$$\tau(n,N) = E\tau(n,N) + \sqrt{E\tau(n,N)/n}\cdot|\log A(n)|^{1/r}\cdot\nu(n,N),$$

$$E\nu(n,N) = 0; \quad \sup_n |||\max_N |\nu(n,N)|\,|||_r = C < \infty.$$

Lemma 7. Let $M$ be some subset of the integer segment $S = [1, 2, \ldots, n]$,
$\bar M = S \setminus M$, $\pi(M) = P(\hat N(n) \in M)$,

$$v = v(n,M) \stackrel{def}{=} \inf_{N \in M} B(n,N)/B(n) \ge 2.$$

Then under conditions (Rq)

$$\pi(M) \le 2\exp\left(-C\,(v\,n\,B(n))^{r/2}/|\log B(n)|\right) \qquad (19)$$

and under conditions (Rk)

$$\pi(M) \le \frac{C}{v^{k/2}\,n^{k/2}\,B^{k/2-2\kappa}(n)}. \qquad (20)$$

Proof. We obtain for the case of (Rq), denoting $V = \max_{N \in S} |\nu(n,N)|$:

$$\pi(M) = P(\hat N(n) \in M) = P\Big(\min_{N \in \bar M} \tau(n,N) > \min_{N \in M} \tau(n,N)\Big) =$$

$$P\Big(\min_{N \in \bar M}\big(B(n,N) + \sqrt{B(n,N)/n}\,|\log B(n)|^{1/r}\nu(n,N)\big) > \min_{N \in M}\big(B(n,N) + \sqrt{B(n,N)/n}\,|\log B(n)|^{1/r}\nu(n,N)\big)\Big) \le$$

$$P\Big(B(n) + \sqrt{B(n)/n}\,|\log B(n)|^{1/r}\,V > vB(n) - \sqrt{vB(n)/n}\,|\log B(n)|^{1/r}\,V\Big).$$

Solving the inequality under the probability sign with respect to $V$ (the case
of $v > C/B(n)$ is trivial), we find:

$$\pi(M) \le P\Big(C\,V\,(1 + \sqrt{v})\sqrt{B(n)/n}\,|\log B(n)|^{1/r} > (v-1)B(n)\Big) \le$$

$$P\left(V > C\,\frac{v-1}{\sqrt{v}+1}\cdot\frac{\sqrt{nB(n)}}{|\log B(n)|^{1/r}}\right) \le P\left(V > C\sqrt{v}\,\frac{\sqrt{nB(n)}}{|\log B(n)|^{1/r}}\right).$$

Using the estimations of lemma 6, we arrive at inequalities (19) and (20). The case
of (Rk) is considered analogously.

Proof of Theorem (R.k) a.s. It follows from expansion (18) that 



T*(n) < B(n) + yjB{n)/n C(k) B- 2R l k [n) V, 



therefore 



r*(n) 



Bin) 



- 1 



< 



C(k) 



^ B i/2-2 K /k( n y 



By the Chebyshev inequality we obtain:



Pn(e) = P 



T^in 



Bin) 



- 1 



>e < 



C(k) 



pk n k/2j^k/2-2K^ 



11) 



Since for any $\varepsilon > 0$ the series $\sum_n P_n(\varepsilon)$ converges, the first
assertion to be proved follows from the Borel-Cantelli lemma. The other is proved
analogously if it is taken into account that $N > N^0(n)(1+\varepsilon)$,
$\varepsilon \in (0,1]$, and condition $(v)$ lead to the inequality
$B(n,N) \ge (1 + C\varepsilon^2)B(n)$, $\varepsilon \in (0,1)$, and lemma 7 is applied.

Analogously we can prove the theorem (Rq) 'a.s, on the basis of inequality: 

Pn(e) < exp {-Ce r (nA(n))/\ log A(n)\) . 

Remark 1. Let us note, and use it below, a slight difference in the behaviors of 
the values r(n,N) and N(n) which consists in the peculiarity of condition (v). At 
v > 2 we have (under the same conditions (Rq), (v) : 



max P 



' N(n) 

N°(n) 



V 



' N(n) 
N°(n) 



> v 



< 



exp 



-Cv r 



(nA(n)) r ' T 

\\og A(n)\ 



An analogous estimation for the probability P (r* (n) / B (n) > v) holds even without 
condition (v). 

Remark 2. The consistency of the proposed estimations in the above-mentioned
sense follows from the assertions already proved. Indeed, since

$$A(n) \le A(n,[\sqrt{n}]) \le Cn^{-1/2} + \rho([\sqrt{n}]) \to 0,$$

then $N^0(n) \to \infty$, $N^0(n)/n \to 0$, because otherwise the value

$$A(n) = A(n,N^0(n)) \asymp N^0(n)/n + \rho(N^0)$$

would not tend to zero.
Since $\hat N(n)/N^0(n) \to 1$, then $\hat N(n) \to \infty$ and analogously
$\hat N(n)/n \to 0$, which proves the consistency of $\hat f$.

Proof of Theorem R.3(q). (The previous theorem is proved analogously). 
Note that because of condition (7I) for n > hq > 2 

||/ - f\\ 2 /B(n) = B(n,N(n))/B(n) + * 3 (N(n))/B(n) < 

(1 - 7 )- 1 r*(n)/ J B(n) + * 3 (JV(n))/S(n) = 
(1 - 7)- 1 + {r* in) I B{n) - 1) + * 3 (^(n))/S(n), 
where, as can easily be seen, ^ 3 (N) = 

n N 

Y. v (Ci,Cs), V(x,y) = ^((pj{x) - Cj)((pj(y) - 9), 

l,s=l,l=£s j=l 

and has the same form and the same estimation as ^(-Af). 

Then we will use the elementary inequality
$P(A) \le P(ABC) + P(\bar B) + P(\bar C)$, in which $A, B, C$ are events and the bar
denotes the complement. Setting

$$A = \{\|\hat f - f\|^2/B(n) > u\}, \quad B = \{1/v \le \tau^*(n)/B(n) \le v\},$$

$C = \{N^0/v \le \hat N(n) \le vN^0(n)\}$, we have at $v \in (2, u - C)$:

$$P(ABC) \le P\Big(v + \max_{N \le vN^0} |\Psi_3(N)|/B(n) > u\Big).$$
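The elementary inequality used in this step holds for arbitrary events; a short justification is sketched below, with the bar denoting the complement of an event.

```latex
% Split A according to whether both B and C occur:
A = (A \cap B \cap C) \cup \bigl(A \cap (\bar B \cup \bar C)\bigr)
  \subseteq (A \cap B \cap C) \cup \bar B \cup \bar C .
% Subadditivity of the probability measure then gives
P(A) \le P(ABC) + P(\bar B) + P(\bar C).
```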



We find analogously to lemma 4: |||\I/3(iV)||| r < 

CVN/n, |||^ 3 (iVi) - *3(N 2 )\\\ r < yH - N 2 \/n, 
and since the entropy integral converges, then (see (15)) 



HI max |*3 (iV) I ||| r < CJvN°{n)/n < Cy/v/JN°{n). 

N<vN°(n) 

Using triangular inequality for the G(ip) — norms we obtain: 



r*(n) 



Bin) 



V 3 (N) 



Bin) 



> v I < exp — C5 v 



r (nA(n)) r ^ 
log A (n) I 



We obtain therefore, based on the properties of the spaces $G(\psi)$:

$$P \le \exp\left(-C_6\left(v^{-1/2}(u - C - v)\sqrt{N^0(n)}\right)^r\right).$$

The other probabilities $P(\bar B)$, $P(\bar C)$ were estimated above, and we find by
summing ($C_7 = 1/(1-\gamma)$):

$$P_f(u) = P(A) \le \exp\left(-\left(C_1(u - C - v)\sqrt{N^0(n)/v}\right)^r\right) + 4\exp\left(-C_2 v^{r/2}(nA(n))^{r/2}\right).$$

Taking into account that $nB(n) \ge N^0$ and choosing

$v = C_4(u - C)$, $C_4 \in (0,1)$, we arrive at the assertion of the theorem.

We proceed now to the problem of estimating density (D). The functional
$\Psi_1(n,N) = \Psi_1(N)$ has in it the following form:



n 2N 

m{N)=n- l Y, E (W(6)- 

i=l j=N+l 



c% 



Using the Rosental inequality once more, we obtain 

/ 2N > " lk 

E^ 1 (N)) 2k <2C(2k)n~ k E[ E c m (Zi) 

\j=N+l 

2k 

2 n~ k C(2k) I | Y Cj(fj{x) I f(x) dx < 



I E c m( x )\ 

J0 \j=N+l ) 



C ■ C(2k)n- k / J2 c m( x ) I dx > 
since /(x) is presumed to be bounded. Then, since 

2N 

|| E c m (x)\\ = \m2N,x)-^(N,x)\\^0, N-,oo, 

j=N+l 

by the Riesz theorem (Timan A., 1960, p. 305) we have
||$(2iV) - ®(N)\\l k k < C k k k \\$(2N) - $(N)\\ 2k = 

C k k k {p{N) - p{2N)) k < C k k k p k {N), 

so that 

E^j k (N) < C k k k n~ k p k (N). (21) 

In the language of G(ip) - spaces inequality (21) means that 

\\\^i(N)\\\ r <Cp(N)/n. 
It is proved analogously that 

|||*i(iVi) - *i(N 2 )||| r < C|p(iV 1 ) -p(N 2 )\/n. 
The functional $\Psi_2(N) = \Psi_2(n,N)$ has the form

n n 
1=1 s=l 

where 

2N 

U(x,y) = U(N,x,y) = E (fj( x ) - c j)(Vj(y) ~ Cj), 

j=N+l 

and is consequently a so-called U— statistic with the kernel U = U(N,x,y). At the 
same time our U— statistic is singular. The asymptotics of the moments of this kind 
of statistics and the limiting distribution for them are to be found e.g. in (Korolyuk 
V.S. at al., 1989), (Ronzin A., 1982). However, here we need non-asymptotic esti- 
mations from above, and therefore additional reasoning will be required. Note first 
of all that 

E\U(^,^)\ m <C m N m -\ m = 3,4,.... (22) 

Let us prove (22). 



E|f/(6,6)r < 4 m C m f 1 f 1 \D 2N (x,y)\ m f(x)f(y)dxdy < 

Jo Jo 



"1 rl 

'o Jo 

-1 /•! 



<C m f f \D 2N (x,y)\ m dxdy, 
Jo Jo 

where, let us recall, / is bounded and D^ is the Dirichlet kernel. The last integral 

is easily estimated and we arrive at (22). Then on account of the singularity of the 

statistics we have in the case of even k: 



n k 



E^(iV) <J2E---Y,i: Etffo.fci) . . . U&&,,) < 

ii=l.j'i=l i k =lj k =l 



c k k k E E • • • E E ec/ 2 (6i,0i) • • -E« 2 te fe/2 ,o fe/2 ) < 

h = ljl = l »fc/2=lj'fc/2=l 

C k k k \U()- u &)\ k k <C k k k N k / 2 . 

In the case of odd k we consider the moment of order k + 1; in [16, p. 42] the 
equivalence is proved of the norms G(ijj), constructed by even moments alone, to 
the initial norm. 

Thus HI^J^^OHIr < CyN/n, and the further course of reasoning is fully analo- 
gous to the ground for estimation of regression. 

Consider now the problem of spectral statistics (S). It turns out unexpectedly
that the reasoning here is even simpler than in problem (D). The fact is
that the initial sequence $\{\xi_i\}$ is assumed to be Gaussian, the empirical Fourier
coefficients $\hat c_k$, i.e. the empirical correlation coefficients, are quadratic
functionals of the trajectory $\{\xi_i\}$, $i = 1, 2, \ldots, n$, while the functional
$\tau(n,N)$ is a polynomial functional of the 4th degree and therefore has the
expansion

$$\tau(n,N) = E\tau(n,N) + \sum_{m=1}^{4} \Psi_m(n,N),$$

where the components of the expansion are mutually uncorrelated and $\Psi_m$ can
be written as an $m$-dimensional stochastic Ito-Wiener integral with respect to the
orthogonal Gaussian measure. At the same time

$$CN/n \ge D\tau(n,N) = \sum_{m=1}^{4} D\Psi_m(n,N),$$

therefore $D\Psi_m(n,N) \le CN/n$. The Plikusas theorem (Plikusas A., 1981) asserts
that the distribution of $\Psi_m(n,N)$ is estimated through the dispersion alone:

$$|||\Psi_m(n,N)|||_{2/m} \le C(m)\,D^{1/2}\Psi_m(n,N) \le C\sqrt{N/n},$$

consequently $|||\tau(n,N) - E\tau(n,N)|||_{1/2} \le C\sqrt{N/n}$. Analogously,
considering the dispersion of the value

$$\zeta(N) = \zeta(n,N) = \sqrt{n/A(n,N)}\,[\tau(n,N) - E\tau(n,N)],$$

we find that $D\zeta(N) \le C/B(n)$ and therefore $|||\zeta(N)|||_{1/2} \le C/B(n)$, and
the difference $\zeta(N_1) - \zeta(N_2)$ is estimated likewise:

$$|||\zeta(N_1) - \zeta(N_2)|||_{1/2} \le C\sqrt{d(N_1,N_2)}/B(n).$$

As a result we obtain for the functional $\tau(n,N)$ expansion (18), which is of key
importance for us:

$$\tau(n,N) = E\tau(n,N) + \sqrt{E\tau(n,N)/n}\cdot\log^2 B(n)\cdot\nu,$$

$$\sup_n |||\max_{N \le n/3} |\nu|\,|||_{1/2} = C < \infty.$$

The other details of the proof are analogous to the case of regression and are
omitted.



6 Adaptive confidence intervals. 

Let us now describe the use of our results for the construction of ACI. Note first of
all that the probability $P_f(u)$, under rather weak conditions (except (Rk)), admits
in all the considered problems an estimate of the form

$$P_f(u) \le 5\exp\left(-\psi(C, N^0(n), B(n))\,u^{r/2}\right), \quad u > C. \qquad (23)$$
As proved above, the values $N^0$, $B(n)$ have respective consistent estimates

$$N^0(n) \sim \operatorname*{argmin}_{N \le n/3} \tau(n,N), \quad B(n) \sim \min_{N \le n/3} \tau(n,N) = \tau^*(n).$$

The value $C$ also depends on $\gamma$ and on the constants $C_j$ appearing in the
definition of condition $(v)$. Under very weak conditions they can also be estimated
consistently from the sample in the following way. Set
$M = M(n) = \exp(\sqrt{\log n})$; then, if conditions $(\gamma)$, $(v)$ are fulfilled, a
system of asymptotic equalities can be written:

$$\tau(M) - A_s M/n \sim (1-\gamma)\rho(M);$$

$$\tau(2M) - 2A_s M/n \sim \gamma(1-\gamma)\rho(M);$$

$$\tau(4M) - 4A_s M/n \sim \gamma^2(1-\gamma)\rho(M).$$

Solving this system, we find the consistent (mod P) estimate of $\gamma$:

$$\hat\gamma = \frac{\tau(4M) - 2\tau(2M)}{\tau(2M) - 2\tau(M)}.$$



(The parameter $A_s$ can also be estimated consistently, but that is not necessary for
us.) Further, since

$$\frac{\tau(n,\hat N(n)(1+v))}{\tau^*(n)} \sim \frac{B(n,\hat N(n)(1+v))}{B(n)} \sim 1 + \frac{C_1 v^2}{1 + C_2 v}, \qquad (24)$$

the constants $C_1, C_2$ can be determined from (24), for instance by the least-squares
method. Substituting the obtained estimates of all the parameters into (23), we get
the estimate of the confidence probability

$$P_f(u) \le 5\exp\left(-\psi(C(\hat\gamma,\hat C_1,\hat C_2), \hat N(n), \tau^*(n))\,u^{r/2}\right) \stackrel{def}{=} \hat P_f(u). \qquad (25)$$



Then, equating the right-hand side of (25) to the unreliability $\delta$ of the
confidence interval, say to the magnitude 0.05 or 0.01, we calculate
$u = u(\delta)$ from the relation

$$\hat P_f(u(\delta)) = \delta$$

and obtain approximately the adaptive confidence interval for $f$ of reliability
$1 - \delta$ of the form

$$\|\hat f - f\|^2 \le u(\delta)\min_{N \le n/3} \tau(n,N). \qquad (26)$$
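Because the right-hand side of (25) is an explicit exponential, the relation for $u(\delta)$ can be solved in closed form. The sketch below treats the estimated value of $\psi$ as a given positive number `psi_hat`; this argument, like the function names, is a placeholder of the illustration, and the caller still has to check that the resulting $u$ exceeds the constant $C$ for which (23) is stated.

```python
import math

def u_of_delta(delta, psi_hat, r):
    """Solve 5 * exp(-psi_hat * u ** (r / 2)) = delta for u.
    psi_hat stands for the estimated value of psi in (25)."""
    if not (0.0 < delta < 5.0) or psi_hat <= 0.0 or r <= 0.0:
        raise ValueError("need 0 < delta < 5, psi_hat > 0 and r > 0")
    return (math.log(5.0 / delta) / psi_hat) ** (2.0 / r)

def aci_bound(delta, psi_hat, r, tau_star):
    """Right-hand side of (26): u(delta) * tau_star, where tau_star is the
    minimum of tau(n, N) over N <= n/3."""
    return u_of_delta(delta, psi_hat, r) * tau_star
```

For example, `aci_bound(0.05, psi_hat, r, tau_star)` would give the squared radius of an approximate 95% interval once `psi_hat`, `r` and `tau_star` are available from the data.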

But for a rough estimate of the error from replacing $f$ by $\hat f$ the following
quite simple method can be recommended. Since

$$\frac{\|\hat f - f\|^2}{B(n)} = \frac{A(n,\hat N(n))}{B(n)} + \frac{\Psi_3(\hat N(n))}{B(n)}, \qquad (27)$$

and the second term on the right-hand side of (27) a.s. tends to zero, while the first
term, if conditions $(\gamma)$, $(v)$ are fulfilled, has $1/(1-\gamma)$ as its limit, we
thus prove the following assertion, apparently well known to specialists in
nonparametric statistics for non-adaptive estimation:

Theorem c.i. If the following conditions are fulfilled in our problems: (Rq),
$(\gamma)$, $(v)$ in problem R, or $(\gamma)$, $(v)$ in problems D, S, then

$$\overline{\lim}_{n\to\infty}\,\|\hat f - f\|^2/B(n) \le 1/(1-\gamma). \qquad (28)$$

In order to construct an adaptive confidence interval, assertion (28) can be
reformulated as follows. With probability tending to 1 as $n \to \infty$

$$\|\hat f - f\|^2 \le B(n)/(1-\gamma),$$

and the ACI is constructed by replacing the values $B(n)$, $\gamma$ by their consistent
estimates:

$$\|\hat f - f\|^2 \le \tau^*(n)\,\frac{\tau(2M) - 2\tau(M)}{3\tau(2M) - 2\tau(M) - \tau(4M)}.$$
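The estimate of $\gamma$ and the simple bound above can be sketched in a few lines. The inputs are the values $\tau(M)$, $\tau(2M)$, $\tau(4M)$ and $\tau^*(n)$ computed from the data; the function names are hypothetical, and no safeguard against estimates of $\gamma$ falling outside $[0, 1)$ is included.

```python
def gamma_hat(tau_M, tau_2M, tau_4M):
    """Estimate of gamma: (tau(4M) - 2 tau(2M)) / (tau(2M) - 2 tau(M))."""
    den = tau_2M - 2.0 * tau_M
    if den == 0.0:
        raise ZeroDivisionError("tau(2M) - 2 tau(M) vanishes")
    return (tau_4M - 2.0 * tau_2M) / den

def simple_aci_bound(tau_star, tau_M, tau_2M, tau_4M):
    """Bound tau_star / (1 - gamma_hat), written in the closed form
    tau_star * (tau(2M) - 2 tau(M)) / (3 tau(2M) - 2 tau(M) - tau(4M))."""
    den = 3.0 * tau_2M - 2.0 * tau_M - tau_4M
    if den == 0.0:
        raise ZeroDivisionError("estimated gamma equals 1")
    return tau_star * (tau_2M - 2.0 * tau_M) / den
```

The two functions are consistent with each other: substituting the estimate of $\gamma$ into $\tau^*(n)/(1-\gamma)$ gives exactly the closed form used in `simple_aci_bound`.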



A more exact result will be obtained by taking into account the following term of 
the expansion of the value | \f — f\ | 2 : 

\\f - f\\ 2 1 c 

r/J 1 < ! + -r== 1 + e n ), 

where e n — > 0; P(|C| > u) < 2exp(— CV"/ 2 ) and C no longer depends on n. Equating 
the probability P(|C| > u ), more exactly its estimate 2exp(— Cu r ^ 2 ) to the value 
S, 8 ~ 0+, we will easily find u = u(S) and construct an approximate ACI with 
reliability ~ 1 — 5 of the form 

ll/-/H 2 <?^ + r.(„) " W 



7 ^JV(n) 

Closer consideration reveals an effect that somewhat reduces the exactness of the ACI.
Let (as is true in all the three considered problems under the formulated
assumptions)

$$P\left(\|\hat f - f\|^2/B(n) > u\right) \le \exp(-\phi(C_1 u)),$$

$$P\left(\tau^*(n)/B(n) < 1/u\right) \le \exp(-\phi(C_2 u)), \quad u > C,$$

where at $u \to \infty \Rightarrow \phi(u) \to \infty$. We denote

$$Q(u) = P\left(\|\hat f - f\|^2/\tau^*(n) > u\right).$$

Theorem r. At $u \le C/B(n)$ the following inequality holds:

$$Q(u) \le 2\exp\left(-\phi(C\sqrt{u})\right).$$
Proof. By the total probability formula (we understand $P(A/B)$ as the conditional
probability, where, of course, $A$ and $B$ are events) we have:

r*(nj i)(n) t>y \-"W V J 

Qi<P (||/ - /HV^H > w/u) < exp(-<f>(C lU /v)y, 

Q 2 <P (T*(n)/B(n) < 1/v) < exp (-<f)(C 2 v)) . 

Summing up and putting $v = C_3\sqrt{u}$, we obtain the assertion of the theorem.

The increase in the probability $Q$ compared to $P_f$ is apparently explained by
the ability of the denominator, i.e. $\tau^*(n)$, to take values close to zero.

Note in conclusion that the estimates proposed by us have successfully passed
experimental tests on problems R, D, S, both on data simulated with the use of
pseudo-random numbers and on real data (seismic signals etc.), for which
our estimations of the spectrum were compared with classical estimates obtained by
the spectral window method.